INTRODUCTION

The following is a final report for Air Force Contract AF 49(638)-1682
covering the period 1 February 1966 to 31 July 1970. The contract
supported research in probability theory, and the personnel involved were
Dr. R. L. Adler and Dr. A. G. Konheim.

For  the  most  part  the  activity  supported  dealt  with  two  branches  of 
probability,  one  applied  and  the  other  pure.  The  applied  one  was  concerned 
with  problems  in  the  area  of  queueing  and  scheduling  and  the  pure  one  with 
ergodic  theory.  Dr.  Konheim  conducted  the  research  in  queueing  and 
Dr.  Adler  that  in  ergodic  theory. 
A COMPREHENSIVE REVIEW

I. Ergodic Theory

A central problem in the study of dynamical systems is the question
of similarity, i.e., when are two systems equivalent under a change of
variables. This question, which begins as a problem in analysis, has been
shown as a result of this research to have its solution, in many instances,
in the subject of coding theory. For the continuous automorphisms of the
two-dimensional torus and the discrete time dynamical systems to which they
give rise, the problem of equivalence under a change of variables was
thoroughly investigated in two works, Entropy, a complete metric invariant
for automorphisms of the torus, and Similarity of automorphisms of the torus.

With such systems the key to finding whether a change of variables (which
has only to satisfy the weakest of requirements, measurability) exists lies
in the subject of information theory and a notion called entropy. It was
proven that two continuous ergodic automorphisms of the 2-torus are
equivalent under measurable changes of variable if and only if they have
the same entropy. The method of constructing the change of variable entailed
representing the history of orbits of points under the mapping as sequences
of symbols from a finite alphabet. Entropy is a concept that tells how
large such an alphabet should be and with what frequency the symbols
should occur. The symbolic sequences associated with torus automorphisms
satisfy one-step transition rules, that is, they come from finite
state Markov processes. A method of coding between these symbolic
sequences was developed that produces the required transformation
indicated in the theorem when entropy conditions are satisfied.
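
The entropy criterion can be illustrated numerically. The sketch below is not taken from the report. It assumes the standard formula that the entropy of the automorphism induced by a 2 x 2 integer matrix of determinant +1 or -1 is the logarithm of the largest absolute value of its eigenvalues. The function name torus_entropy and the sample matrices are hypothetical choices made for illustration.

```python
import math

def torus_entropy(a, b, c, d):
    """Entropy of the 2-torus automorphism x -> Ax (mod 1), A = [[a, b], [c, d]].

    Assumes the standard formula: entropy = log of the largest eigenvalue
    modulus. Returns 0.0 when no eigenvalue lies off the unit circle.
    """
    det = a * d - b * c
    if det not in (1, -1):
        raise ValueError("matrix must have determinant +1 or -1")
    trace = a + d
    disc = trace * trace - 4 * det
    if disc <= 0:
        # Complex or repeated eigenvalues of modulus 1: zero entropy.
        return 0.0
    largest = (abs(trace) + math.sqrt(disc)) / 2
    return math.log(largest)

# Two matrices with trace 3 and determinant 1 share the same entropy,
# while a matrix with trace 4 has a different one.
h1 = torus_entropy(2, 1, 1, 1)
h2 = torus_entropy(1, 1, 1, 2)
h3 = torus_entropy(3, 1, 2, 1)
same_entropy = math.isclose(h1, h2)
```

By the theorem quoted above, the first two automorphisms are equivalent under a measurable change of variables, and neither is equivalent to the third.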

The significance of the above research to the Air Force is believed
by this writer to be the following. Although direct applications
are removed, the subject of information theory has been advanced.
Some of the coding results may lead to contributions to the area of
cryptography and the security of data transmission systems. In another
direction the above work shows how information theory might possibly
be applied to some of the very hard problems of dynamical systems such
as the n-body problem and stability behavior of fluid flow.


II.  Applied  Probability 

The  research  in  this  area  was  devoted  to  problems  of  queueing, 
scheduling,  time  sharing  and  sorting.  In  order  to  determine  optimum 
policies  a  model  has  to  be  set  up  and  its  probabilistic  nature  analyzed. 
A  sample  of  work  in  this  area  is  the  following. 

In A Disk Access Model we consider a disk file and a buffer of
capacity b.
The disk is divided into m sectors. When a sector reaches the point
P, data can be read from the buffer into the sector. The disk access
method is characterized by a parameter r, the number of requests to
be read into a sector simultaneously. The parameter r takes values
1, 2, ..., b. The state of the buffer at the start of a cycle (as shown
above) is a vector

$$X = (x^{(1)}, x^{(2)}, \ldots, x^{(m)})$$

where $x^{(i)}$ is the number of requests for sector i. Here
$x^{(i)} \ge 0$ $(1 \le i \le m)$ and $x^{(1)} + x^{(2)} + \cdots + x^{(m)} = b$.
The set of all such vectors is the state space of the buffer. If there are
$t_i$ requests for sector i when the disk is in the position to read into
sector i, then $\min(r, t_i) = s_i$ of them are satisfied. This removes
$s_i$ requests from the buffer and these are immediately replaced by $s_i$
new requests (before the disk reaches the position in which it may read
into the (i + 1)st sector). They are chosen independently, each with the
uniform distribution over the set of integers {1, 2, ..., m}. The problem
is to determine the stationary distribution of the state vector X. In this
paper we calculate the average rate at which data flows from the buffer to
the disk for r = 1 and r = b.
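
The dynamics of the model can be sketched as a small simulation. This is an illustration only, not the analysis of the paper: the function name simulate_disk, the seeded generator, and the choice to fill the initial buffer with b uniformly chosen requests are all assumptions made here. The rate is estimated as requests served per sector visit.

```python
import random

def simulate_disk(m, b, r, cycles, seed=0):
    """Simulate the disk access model; return (served per sector visit, final state)."""
    rng = random.Random(seed)
    # Assumed initial state: b requests, each for a uniformly chosen sector.
    x = [0] * m
    for _ in range(b):
        x[rng.randrange(m)] += 1
    served = 0
    for _ in range(cycles):
        for i in range(m):
            s = min(r, x[i])          # s_i = min(r, t_i) requests are satisfied
            x[i] -= s
            served += s
            for _ in range(s):        # each is replaced by a new uniform request
                x[rng.randrange(m)] += 1
    return served / (cycles * m), x

rate_single, state_single = simulate_disk(m=8, b=4, r=1, cycles=200)
rate_full, state_full = simulate_disk(m=8, b=4, r=4, cycles=200)
```

The buffer always holds exactly b requests, so the components of the returned state sum to b, and the estimated rate can never exceed min(r, b) per sector visit.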

The problem of sorting on a computer is one of the earliest
problems in the data processing field. Sorting is the operation of
arranging a sequence of records in some order. One might imagine the
construction of a large telephone directory from several small ones.
It has been estimated that as much as 40% of all time on a computer in
nonscientific applications is spent in sorting. In A Note on Merging
the merging operation which occurs in sorting is studied. We imagine
that a set of q 'strings' of numbers have been generated

$$X_{i,1}, X_{i,2}, \ldots, X_{i,n} \qquad (i = 1, 2, \ldots, q).$$

These 'strings' are in their natural order, i.e.

$$X_{i,1} \le X_{i,2} \le \cdots \le X_{i,n}$$

and we assume for simplicity that they are distinct. The q 'strings'
are the result of internal sorting and n should be regarded as their
average length. The q strings are to be merged to form a single list,
this list being in the natural order. The particular merging operation
is carried out on a disk system and the 'time' needed to merge is an
idealization of the time needed in the disk system. In this note the
expected length of time needed to merge is determined.
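
The merging of q ordered strings into a single ordered list can be sketched as follows. This is a generic heap-based q-way merge, chosen here for illustration. It does not model the disk system or the idealized 'time' studied in the note, and the function name merge_strings is hypothetical.

```python
import heapq

def merge_strings(strings):
    """Merge q sorted lists into one sorted list using a heap of string heads."""
    # Each heap entry is (value, string index, position within that string).
    heap = [(s[0], i, 0) for i, s in enumerate(strings) if s]
    heapq.heapify(heap)
    merged = []
    while heap:
        value, i, k = heapq.heappop(heap)   # smallest remaining head
        merged.append(value)
        if k + 1 < len(strings[i]):
            # Advance within string i and expose its next element.
            heapq.heappush(heap, (strings[i][k + 1], i, k + 1))
    return merged

merged = merge_strings([[1, 4, 9], [2, 3, 10], [5, 6, 7]])
```

Here q = 3 and n = 3, and merged is the nine numbers in their natural order.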

A  question  that  arises  in  connection  with  the  above  work  on 
sorting  was  dealt  with  in  A  Note  on  Order  Statistics. 

Let $X_1, X_2, \ldots$ be independent and identically distributed. The
order statistics (of a sample of size n) are just the random variables
$X_1, X_2, \ldots, X_n$ arranged in non-decreasing order

$$X_{1,n} \le X_{2,n} \le \cdots \le X_{n,n}.$$

Here $X_{1,n} = \min(X_1, X_2, \ldots, X_n)$ and
$X_{n,n} = \max(X_1, X_2, \ldots, X_n)$. Let F be the common distribution
function of the $X_i$. If F is the uniform distribution function (on
[0, 1] say) then

$$(*) \qquad E(X_{i,n}) = \frac{i}{n+1}.$$

It is natural to ask if (*) characterizes the uniform distribution on
[0, 1]. In this note we prove that the numbers $\{E(X_{n,n})\}_{n \ge 1}$
determine F.
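
Formula (*) can be checked exactly for small samples. The sketch below assumes the standard density of the i-th order statistic of n uniform variables, $\frac{n!}{(i-1)!\,(n-i)!} x^{i-1} (1-x)^{n-i}$, which is not stated in the report, and integrates term by term with rational arithmetic. The function name expected_order_statistic is hypothetical.

```python
from fractions import Fraction
from math import comb, factorial

def expected_order_statistic(i, n):
    """Exact E(X_{i,n}) for n independent uniform [0, 1] variables."""
    # Normalizing constant of the assumed order-statistic density.
    coeff = Fraction(factorial(n), factorial(i - 1) * factorial(n - i))
    # Integral of x^i (1 - x)^(n - i) over [0, 1], expanding (1 - x)^(n - i)
    # by the binomial theorem and integrating each power of x.
    integral = sum(
        Fraction((-1) ** k * comb(n - i, k), i + k + 1)
        for k in range(n - i + 1)
    )
    return coeff * integral

# Compare with i / (n + 1) for every sample size up to 7.
agrees = all(
    expected_order_statistic(i, n) == Fraction(i, n + 1)
    for n in range(1, 8)
    for i in range(1, n + 1)
)
```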

A Note on Growing Binary Search Trees is concerned with the
number of search operations needed to locate an object in a collection.
A binary search tree is a tree (a graph without cycles) in which there
exists a distinguished vertex called the root. The root has degree two
(two edges leave the root) and all other vertices have degree one or
three (one edge enters and two leave). T will denote the family of
binary search trees and $T_n$ the subset of T with n leaves. (A leaf
of a binary search tree is a vertex of degree one.) If $t \in T$ and
$\ell$ is a leaf of t then $d_t(\ell)$ is the distance of the leaf $\ell$
from the root. The average leaf distance is given by

$$D_t = \frac{\sum_{\ell} d_t(\ell)}{\text{no. leaves in } t}.$$

In this note we calculate $E(D_t : t \in T_n)$. The probability measure
on $T_n$ is a 'natural' measure which one obtains by growing trees in
$T_n$ from trees in $T_{n-1}$. The basic idea is to choose a leaf in a
tree of $T_{n-1}$ and to change it into a vertex of degree three, thus
obtaining an element of $T_n$.
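
The growth procedure can be sketched with exact arithmetic for small n. The report does not say how the leaf is chosen, so the sketch assumes it is chosen uniformly at random among the leaves of the current tree. Since $D_t$ depends only on the leaf distances, a tree is represented here by the sorted tuple of its leaf distances. The function names are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction

def grow(dist):
    """One growth step on a distribution over leaf-distance tuples.

    Assumption: the leaf to be changed is chosen uniformly at random.
    """
    new = defaultdict(Fraction)
    for depths, p in dist.items():
        n = len(depths)
        for k, d in enumerate(depths):
            # The chosen leaf becomes a vertex of degree three with two
            # new leaves, each one step farther from the root.
            child = depths[:k] + depths[k + 1:] + (d + 1, d + 1)
            new[tuple(sorted(child))] += p / n
    return dict(new)

def expected_average_leaf_distance(n):
    """E(D_t : t in T_n) under the uniform-growth assumption, for n >= 2."""
    dist = {(1, 1): Fraction(1)}   # T_2: the root with two leaves at distance 1
    for _ in range(n - 2):
        dist = grow(dist)
    return sum(p * Fraction(sum(depths), len(depths)) for depths, p in dist.items())

values = [expected_average_leaf_distance(n) for n in (2, 3, 4)]
```

Under this assumption the sketch gives 1, 5/3 and 13/6 for n = 2, 3 and 4.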



Finally, Service in a Loop System studies the service problem
in two special data communications systems: the loop and star
configurations.


[Figure: STAR SYSTEM]
The loop system consists of a main station (CPU), N stations
arranged on a loop (terminals) and a server of capacity C (a time
multiplexed line with C frames per unit time period). The server
makes a tour of the N stations serving requests which the stations
make. The system is asynchronous in the sense that the full capacity
of the server is available to the first station and thereafter the capacity
offered to the succeeding stations is the residual capacity, that is, the
capacity C reduced by the number of requests served at preceding stations.
The input processes to the N stations are independent processes; each is a
process with independent stationary increments.
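
The residual-capacity rule for one tour of the server can be sketched as follows. The sketch assumes each request occupies exactly one frame and that the waiting counts at the start of the tour are given. The function name serve_tour is hypothetical, and the arrival processes are not modelled.

```python
def serve_tour(queues, capacity):
    """Serve one tour of the loop; return (requests served per station, unused capacity)."""
    residual = capacity              # the first station sees the full capacity C
    served = []
    for waiting in queues:
        s = min(waiting, residual)   # a station can use only what is left
        served.append(s)
        residual -= s                # later stations see the reduced capacity
    return served, residual

served, unused = serve_tour([3, 0, 4, 2], capacity=6)
```

With C = 6 the first station is served in full, the third receives only the three remaining frames, and the fourth receives none, which shows why the grade of service differs from station to station.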

The star system consists of the N stations each having a separate
channel to a main waiting room (or buffer). The requests for service
from the stations queue up in the buffer on a first-come, first-served
basis.

The  problem  analyzed  in  this  paper  is  the  waiting  line  and  waiting 
time  statistics  for  the  loop  system.  By  specializing  the  parameters  the 
results  for  the  star  system  are  obtained.  In  particular  the  grade  of 
service  from  station  to  station  in  the  loop  is  of  interest. 

The  value  of  this  work  to  the  Air  Force  coincides  with  its  value 
to  the  computer  industry.  Any  research  which  leads  to  more  efficient 
methods  of  data  processing  will  benefit  both  institutions. 